Method and apparatus for coding/decoding items of subtitling data

ABSTRACT

Subtitling can be based on either pixel data or on character data. Character data allow very efficient encoding, but from character strings alone, subtitling cannot be converted into a graphical representation to be overlaid over video. The intended character set, font and font parameters such as the font size must either be coded explicitly within the subtitling bitstream or an implicit assumption must be made about them. In pixel-based subtitling, subtitling frames are conveyed directly in the form of graphical representations by describing them as (typically rectangular) regions of pixel values on the AV screen, at the cost of considerably increased bandwidth for the subtitling data. According to the invention, a font memory is used that allows an efficient realization of pixel-based subtitle lettering, because the glyphs need only be transmitted once and thereafter are referenced by relatively compact character references during the AV event. Thereby the invention combines the advantages of pure pixel-based and pure character-based subtitling schemes, while mostly avoiding their respective shortcomings.

The invention relates to a method and to an apparatus for coding/decoding items of subtitling data, in particular subtitling and graphics for Blu-ray disc optical storage and recording.

BACKGROUND

In the area of subtitling for pre-recorded Audio-Visual (AV) material, conflicting requirements exist. On the one hand, subtitling data should be efficiently encoded, especially if a whole bouquet of subtitling services is to be provided for any given AV material. In this case, at least on average, very few bits are available per subtitling character. On the other hand, professional content owners want to have full control over the appearance of subtitling characters on screen; additionally, they want to have at their command a rich set of special display effects, from simple fading all the way through to genuine animations. Such a high degree of design freedom and command is normally feasible only with high or very high subtitling bandwidth.

Two main approaches exist in today's state of the art for subtitling pre-recorded AV data signals with separate subtitling information: subtitling can be based on either pixel data or on character data. In both cases, subtitling schemes comprise a general framework, which for instance deals with the synchronisation of subtitling elements along the AV time axis.

In the character-based subtitling approach, e.g. in the TELETEXT system (see ETSI: ETS 300 706 Enhanced Teletext specification, May 1997) for European analog or digital TV, strings are described by sequences of letter codes, e.g. ASCII (see ISO/IEC 8859: American Standard Code for Information Interchange - ASCII) or UNICODE (see ISO/IEC 10646: Information technology - Universal Multiple-Octet Coded Character Set (UCS)), which intrinsically allows for a very efficient encoding. But from character strings alone, subtitling cannot be converted into a graphical representation to be overlaid over video. For this, the intended character set, font and some font parameters, most notably the font size, must either be coded explicitly within the subtitling bitstream or an implicit assumption must be made about them within a suitably defined subtitling context. Also, any subtitling in this approach is confined to what can be expressed with the letters and symbols of the specific font or fonts in use.

The DVB Subtitling specification (see ETSI: ETS 300 743 Digital Video Broadcasting (DVB); Subtitling systems, September 1997, and EP-A-0 745 307: Van der Meer et al, Subtitling transmission system), with its object types of ‘basic object, character’ or ‘composite object, string of character’, constitutes another state-of-the-art example of character-based subtitling.

In the pixel-based subtitling approach, subtitling frames are conveyed directly in the form of graphical representations by describing them as (typically rectangular) regions of pixel values on the AV screen. Whenever and wherever anything is meant to be visible in the subtitling plane superimposed onto video, its pixel values must be encoded and provided in the subtitling bitstream, together with appropriate synchronisation information. While this obviously removes any limitations inherent in fonts defined by third parties, the pixel-based approach carries the penalty of a considerably increased bandwidth for the subtitling data proper. Examples of pixel-based subtitling schemes can be found in DVD's ‘Sub-picture’ concept (see DVD Forum: DVD Specifications for Read-Only Disc/Part 3 Video Specifications/Version 1.0 August 1996) as well as in the ‘bitmap object’ concept of DVB Subtitling (see ETS 300 743 and EP-A-0 745 307 mentioned above).

INVENTION

A problem to be solved by the invention is to combine the efficient encoding of character-based subtitling with full control over the appearance of subtitling characters as is feasible with pixel-based subtitling, without significantly increasing the data amount required for transferring the necessary information. This problem is solved by the methods disclosed in claims 1 and 7. An apparatus that utilises the method of claim 1 is disclosed in claim 4.

The invention is based on a pixel-based subtitling scheme. This subtitling system includes several components which allow font support to be included in an otherwise pixel-based subtitling scheme. This font support includes:

-   a.1) A structure for Font Describing Data for efficiently describing a set of font characters in pixel data form;
-   a.2) A structure for Font Identification Data to uniquely identify a predefined font to be used;
-   a.3) A concept of having a font memory as a part of the overall memory area, wherein that font memory is dedicated to hold the font characters, and is not directly visible in the AV output;
-   a.4) A structure for Character Referencing Data for efficiently referencing individual font characters from amongst the font or fonts stored in the font memory.

Font Describing Data as well as Character Referencing Data are transmitted or stored alongside AV data, whereby that transmission or storage has either the format of a nearly inseparable mix or uses completely separate transmission channels or storage locations, or is a mix of both. At decoder side the Font Describing Data cause a set of arbitrary character glyphs (graphical representation of a character) or other graphics building blocks to be loaded into the font memory. The number and design of character glyphs to be used in each individual case is completely under the control of the content provider.

According to the invention, the Font Describing Data consist of one or more character parameter parts, each comprising character parameter sets of one or more characters in the font, and one or more character pixel data parts, each comprising the pixel data of one or more characters in the font. The pixel data of a character are represented as a character array, i.e. as a rectangular array of pixel values, the array having a width and a height specific to the character. Each one of said character parameter sets includes any combination of:

-   c.1) The width of the character array;
-   c.2) The height of the character array;
-   c.3) The start address of the pixel data of the character relative to the character pixel data part containing it;
-   c.4) A horizontal offset between the boundaries of the array and a character reference point;
-   c.5) A vertical offset between the boundaries and the character reference point;
-   c.6) A horizontal increment describing the horizontal distance between the character and those characters to either precede or succeed it.
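For illustration only, a character parameter set with all six items c.1 to c.6 present could be modelled as in the following sketch. The field names, the integer widths and the big-endian 14-byte packing are assumptions made for this example; the description above only enumerates the quantities and allows any combination of them.

```python
import struct
from dataclasses import dataclass, astuple

# Assumed binary layout (not taken from the description):
# width, height: unsigned 16 bit; start address: unsigned 32 bit;
# offsets: signed 16 bit; increment: unsigned 16 bit. 14 bytes in total.
PARAM_FORMAT = ">HHIhhH"


@dataclass(frozen=True)
class CharacterParameterSet:
    """Hypothetical container for items c.1 to c.6 of one character."""
    width: int          # c.1 width of the character array
    height: int         # c.2 height of the character array
    start_address: int  # c.3 start of pixel data within its pixel data part
    offset_x: int       # c.4 horizontal offset to the character reference point
    offset_y: int       # c.5 vertical offset to the character reference point
    increment_x: int    # c.6 horizontal increment to the neighbouring character

    def pack(self) -> bytes:
        """Serialise the six fields in declaration order."""
        return struct.pack(PARAM_FORMAT, *astuple(self))

    @classmethod
    def unpack(cls, data: bytes) -> "CharacterParameterSet":
        """Rebuild a parameter set from its 14-byte packed form."""
        return cls(*struct.unpack(PARAM_FORMAT, data))
```

A decoder built along these lines would read one such record per character from a character parameter part before addressing the pixel data.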

The inventive use of a font memory provides an efficient realisation of pixel-based subtitle lettering, because the glyphs need only be transmitted once and thereafter are referenced by relatively compact character references during the AV event.

On the other hand, because glyphs are effectively provided in pixel-based form, the appearance of subtitling is put entirely under the content provider's control, and all problems of font identification, font selection, font parametrisation and character rendering, which normally come with character-based schemes, are advantageously avoided.

In this way, the invention actually combines the advantages of pure pixel-based and pure character-based subtitling schemes, while mostly avoiding their respective shortcomings.

In principle, the inventive method is suited for decoding items of subtitling data, including the steps:

-   retrieving items of Character Referencing Data that are related to corresponding parts of a video or audio-visual data signal, which data items describe sequences of characters as well as information about where in pictures of said data signal and/or when and/or how to make the referenced characters visible using a display memory;
-   deriving from said items of Character Referencing Data items of Character Selecting Information and Character Positioning Information;
-   reading pixel data of said referenced characters as designated by said items of Character Selection Information from a font memory;
-   writing said pixel data into said display memory as designated by said items of Character Positioning Information.
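The decoding steps listed above can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the font memory is modelled as a dictionary from a character reference to a list of pixel rows, the display memory as a list of rows, and each item of Character Referencing Data as a hypothetical `(reference, x, y)` tuple. Timing and clipping are left out.

```python
def decode_subtitle(char_ref_items, font_memory, display_memory):
    """Copy referenced glyphs from font memory into display memory.

    char_ref_items : iterable of (reference, x, y) tuples (assumed format)
    font_memory    : dict mapping reference -> list of pixel rows
    display_memory : list of pixel rows, modified in place
    """
    for item in char_ref_items:
        # Derive selecting and positioning information from the item.
        reference, x, y = item
        # Read the pixel data of the referenced character from font memory.
        glyph = font_memory[reference]
        # Write the pixel data into display memory at the given position.
        for row_index, row in enumerate(glyph):
            for col_index, value in enumerate(row):
                display_memory[y + row_index][x + col_index] = value
    return display_memory
```

In this sketch a glyph is transmitted once into `font_memory` and can then be placed any number of times by short `(reference, x, y)` items.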

In principle, the inventive apparatus is suited for decoding items of subtitling data, said apparatus including:

-   means for retrieving items of Character Referencing Data that are related to corresponding parts of a video or audio-visual data signal, which data items describe sequences of characters as well as information about where in pictures of said data signal and/or when and/or how to make the referenced characters visible using a display memory;
-   means for:
    -   deriving from said items of Character Referencing Data items of Character Selecting Information and Character Positioning Information;
    -   reading pixel data of said referenced characters as designated by said items of Character Selection Information from a font memory;
    -   writing said pixel data into said display memory as designated by said items of Character Positioning Information.

Advantageous additional embodiments of the invention are disclosed in the respective dependent claims.

DRAWINGS

Exemplary embodiments of the invention are described with reference to the accompanying drawings, which show in:

FIG. 1 Inventive data structure;

FIG. 2 Block diagram of the inventive subtitling system;

FIG. 3 Example data structure for embedding a ‘font_id’ into a DVB-ST ‘object_data_segment’.

EXEMPLARY EMBODIMENTS

As illustrated in FIG. 1, the Font Describing Data 102 as well as the Character Referencing Data 103 are transferred, stored or recorded together with related AV data 101, whereby the transmission or storage can be anything between a nearly inseparable mix and the use of completely separate transmission channels or storage locations.

At decoder side, as shown in FIG. 2, a subtitling stream 201 passes through data separation means 202, which in turn provides Character Referencing Data 203 and Font Describing Data 204. By passing through a font describing data processing means 205, the Font Describing Data 204 cause a set of arbitrary character glyphs or other graphics building blocks to be loaded into a font memory 208.

Advantageously, the number and design of character glyphs to be used in each individual use case is completely under the content provider's control.

Optionally, the above-mentioned Font Identification Data can be associated with a font thus described and loaded into font memory 208.

The Character Referencing Data 203 cause character referencing data processing means 206 to copy individual subsets of the set of character glyphs, denoted Character Describing Data 209, from font memory 208 into a display memory 207, which can be a part of the overall system memory. The content of display memory 207 is overlaid onto video and hence becomes a visible subtitle.

Optionally, the Character Referencing Data can contain references to the Font Identification Data, thus allowing a subtitling decoder to decide whether a font required for rendering a specific subtitling stream must still be loaded into font memory 208, or is already available for immediate use.
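The decision whether a font must still be loaded can be sketched as a lookup keyed by the font identifier. The class name `FontMemory`, the method `ensure_loaded` and the `load_font` callback are hypothetical names introduced for this example; how the Font Describing Data are actually fetched is outside the sketch.

```python
class FontMemory:
    """Hypothetical font memory that tracks which fonts are present."""

    def __init__(self):
        self._fonts = {}  # font identifier -> loaded glyph data

    def ensure_loaded(self, font_id, load_font):
        """Load the font only if it is not yet available.

        Returns True if the font had to be loaded, False if it was
        already present and is available for immediate use.
        """
        if font_id in self._fonts:
            return False
        # load_font stands in for retrieving and processing the
        # Font Describing Data of the requested font.
        self._fonts[font_id] = load_font(font_id)
        return True

    def glyphs(self, font_id):
        """Return the glyph data of a font that has been loaded."""
        return self._fonts[font_id]
```

With such a check, a font pre-loaded for a long AV program is fetched once, and later subtitling streams referring to the same identifier reuse it.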

Possible uses and modes of operation of the proposed subtitling system can include, but are not limited to, one of:

-   b.1) Pre-loading at least one font for use throughout a long AV program;
-   b.2) Use of fonts containing more than one variant for at least one of the letters, the use of which includes, but is not limited to, subpixel-accurate letter positioning or emphasis (bold/italic) support;
-   b.3) Loading font subsets for parts of AV material (e.g. movie chapters) in cases where sparse subsets of big fonts are used, such as Asian fonts.

For the further structure of the Font Describing Data, several variants of specific embodiment are proposed as follows.

In a first variant, if the font is a proportional font where individual characters have variable width, all the character arrays are horizontally padded to be nominally of equal width, and the resulting padded character arrays are vertically concatenated into a font array. The font array is then line-scanned in the conventional way to form a single character pixel data part.

In another variant, all character arrays are vertically padded to be nominally of equal height, and the resulting padded character arrays are horizontally concatenated into a font array. The font array is then line-scanned in the conventional way into a single character pixel data part.

For both above variants, the single character pixel data part is preceded by a single character parameter part comprising the character parameter sets of all characters in the font.
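As a hedged illustration of the first variant, the following sketch pads character arrays horizontally to a common width, stacks them vertically into a font array and then line-scans that array. Character arrays are modelled as lists of pixel rows; the padding value 0 and the returned list of start rows are assumptions made for this example.

```python
def build_font_array(char_arrays, pad_value=0):
    """Pad all character arrays to equal width and stack them vertically.

    Returns the font array and, for each character, the row at which
    its padded array starts within the font array.
    """
    max_width = max(len(row) for array in char_arrays for row in array)
    font_array = []
    start_rows = []
    for array in char_arrays:
        start_rows.append(len(font_array))
        for row in array:
            padding = [pad_value] * (max_width - len(row))
            font_array.append(list(row) + padding)
    return font_array, start_rows


def line_scan(font_array):
    """Flatten the font array row by row into one pixel sequence."""
    return [value for row in font_array for value in row]
```

The second variant would be the transposed case: padding to a common height and concatenating the arrays side by side before line-scanning.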

In another variant, the Font Describing Data are generated by alternately concatenating the character parameter sets and the character arrays, for all characters in the font.

In another variant, the Font Describing Data are generated by first concatenating all the character parameter sets into a single character parameter part, and appending to that part a single character pixel data part comprising all the character arrays.
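The layout of this variant, a single character parameter part followed by a single character pixel data part, can be sketched as below. Several details are assumptions made purely for illustration: one byte per pixel, a leading 16-bit character count, and parameter sets reduced to width, height and start address packed as 8 bytes each.

```python
import struct

PARAM = ">HHI"  # assumed: width, height, start address (8 bytes)


def build_font_describing_data(characters):
    """characters: list of (width, height, pixel_bytes) tuples."""
    parameter_part = b""
    pixel_part = b""
    for width, height, pixels in characters:
        if len(pixels) != width * height:
            raise ValueError("pixel data does not match width * height")
        # The start address is relative to the pixel data part.
        parameter_part += struct.pack(PARAM, width, height, len(pixel_part))
        pixel_part += pixels
    header = struct.pack(">H", len(characters))  # assumed count field
    return header + parameter_part + pixel_part


def read_character(data, index):
    """Return (width, height, pixel_bytes) of the character at index."""
    (count,) = struct.unpack_from(">H", data, 0)
    size = struct.calcsize(PARAM)
    width, height, start = struct.unpack_from(PARAM, data, 2 + size * index)
    begin = 2 + size * count + start
    return width, height, data[begin:begin + width * height]
```

Because every start address is stored in the parameter part, a decoder can fetch any single glyph without scanning the pixel data of the characters before it.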

In another variant, which may optionally extend all above variants, a UNICODE (see ISO/IEC 10646: Information technology - Universal Multiple-Octet Coded Character Set (UCS)) code is associated to some or all of the characters of the font, and the UNICODE code is inserted and included at an identifiable position within that part of the Font Describing Data which is associated with the character in question.

In another variant, which may optionally extend all above variants, a non-repetitive character identifier is associated to every character of the font, and the identifier is inserted and included at an identifiable position within that part of the Font Describing Data which is associated with the character in question.

In all above variants, the Font Describing Data can either be

-   d.1) directly transmitted using one codeword per data item, or
-   d.2) compressed by run-length coding, or
-   d.3) compressed by other methods for lossless compression such as the ‘zlib’ method used in PNG (see W3C recommendation, PNG (Portable Network Graphics) Specification, Version 1.0, 1996, http://www.w3.org/TR/REC-png.pdf).
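Options d.2 and d.3 can be illustrated with the following sketch. The run-length format used here, pairs of (run length, pixel value) with runs capped at 255, is an assumption chosen for simplicity and is not prescribed by the description; the zlib round trip uses the standard Python `zlib` module.

```python
import zlib


def rle_encode(pixels: bytes) -> bytes:
    """Encode pixels as (run length, value) byte pairs, runs up to 255."""
    out = bytearray()
    i = 0
    while i < len(pixels):
        run = 1
        while (i + run < len(pixels)
               and pixels[i + run] == pixels[i]
               and run < 255):
            run += 1
        out += bytes((run, pixels[i]))
        i += run
    return bytes(out)


def rle_decode(data: bytes) -> bytes:
    """Inverse of rle_encode."""
    out = bytearray()
    for k in range(0, len(data), 2):
        out += bytes([data[k + 1]]) * data[k]
    return bytes(out)


def zlib_roundtrip(pixels: bytes) -> bytes:
    """Compress and decompress with zlib, returning the restored pixels."""
    return zlib.decompress(zlib.compress(pixels))
```

Glyph bitmaps typically contain long runs of background pixels, which is why both approaches tend to pay off for font pixel data.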

For the structure of the Font Identification Data, several variants of specific embodiment are proposed as follows. In a first variant, the Font Identification Data structure is embodied as a ‘font_id’ as defined in the ‘Portable Font Resource’ (PFR) system (see Bitstream Inc.: TrueDoc PFR Specification, http://www.bitstream.com/pfrspec/index.html).

In another variant, the Font Identification Data structure in the form of a PFR ‘font_id’ is embodied into the abovementioned DVB subtitling system, using a data structure as illustrated in FIG. 3.

In another variant, the Font Identification Data structure is embodied as a “Universally Unique Identifier” (UUID) as defined in ISO/IEC 11578:1996, Information technology - Open Systems Interconnection - Remote Procedure Call (RPC).

In the context of the invention, the Character Referencing Data consist of a sequence of one or more character reference groups, each accompanied by group positioning data, and each character reference group consists of a sequence of one or more character references, each accompanied by character positioning data.

The group positioning data can preferably be embodied as one of:

-   e.1) Absolute horizontal and vertical coordinates of a group reference point relative to the origin of the video image;
-   e.2) Relative horizontal and vertical coordinates of the group reference point relative to the group reference point of the previous character reference group;
-   e.3) Relative horizontal and vertical coordinates relative to any other prescribed reference point.

The character references can preferably be embodied as one of:

-   f.1) Character indexes referring to the implicit position of the designated character within the Font Describing Data;
-   f.2) Any kind of unambiguous character identifiers;
-   f.3) ASCII codes if they have been unambiguously assigned to the characters;
-   f.4) UNICODE codes if they have been unambiguously assigned to the characters.

The character positioning data can preferably be embodied as one of:

-   g.1) An automatic advance needing no additional individual character positioning data, the advance being deducible from the position of the character reference point of the previous character and from the horizontal increment of the character in question;
-   g.2) An automatic advance with character position offset data, where for the horizontal as well as for the vertical position of the character a first value, deduced from the position of the character reference point of the previous character and from the horizontal increment of the character in question, is added to a second value which is individually described in the character positioning data;
-   g.3) Relative character positioning data applied relative to the character reference point of the previous character;
-   g.4) Absolute character positioning data applied relative to the video image origin.
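Options g.1 and g.2 can be sketched as follows. The sketch reads the text literally: the reference point of a character is the reference point of the previous character plus the horizontal increment of the character in question, optionally plus an individual offset. Two points are assumptions of this example: the first character of a group sits at the group reference point, so its increment is unused, and an offset applied to one character carries over to the following ones because it moves that character's reference point.

```python
def place_characters(group_x, group_y, increments, offsets=None):
    """Compute character reference points for one character reference group.

    group_x, group_y : group reference point (assumed start position)
    increments       : horizontal increment per character (item c.6)
    offsets          : optional per-character (dx, dy) pairs for option g.2;
                       None selects the pure automatic advance of g.1
    """
    positions = []
    x, y = group_x, group_y
    for index, increment in enumerate(increments):
        if index > 0:
            # Advance from the previous character's reference point.
            x += increment
        if offsets is not None:
            dx, dy = offsets[index]
            x += dx
            y += dy
        positions.append((x, y))
    return positions
```

Under g.1 no per-character positioning data need to be transmitted at all, which keeps the Character Referencing Data compact for ordinary lines of text.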

1. Method for decoding items of subtitling data, characterised by the steps: retrieving (202) items of Character Referencing Data (103, 203) that are related to corresponding parts of a video or audio-visual data signal (101), which data items (103, 203) describe sequences of characters as well as information about where in pictures of said data signal and/or when and/or how to make the referenced characters visible using a display memory (207); deriving (206) from said items of Character Referencing Data (103, 203) items of Character Selecting Information and Character Positioning Information; reading (206) pixel data of said referenced characters as designated by said items of Character Selection Information from a font memory (208); writing (206) said pixel data into said display memory (207) as designated by said items of Character Positioning Information.
2. Method according to claim 1, wherein the following steps are carried out before retrieving (202) said items of Character Referencing Data (103, 203): retrieving (202) items of Font Describing Data (102, 204) related to corresponding ones of said items of Character Referencing Data (103, 203); writing (205) said items of Font Describing Data into said font memory (208).
3. Method according to claim 1 or 2, wherein, after retrieving said items of Character Referencing Data (103, 203), the following steps are carried out: checking whether or not said pixel data of said referenced characters are already stored in said font memory (208); if not true, retrieving (202) such items of Font Describing Data (102, 204) which contain said referenced characters; writing said items of Font Describing Data into said font memory (208).
4. Apparatus for decoding items of subtitling data, said apparatus including: means (202) for retrieving items of Character Referencing Data (103, 203) that are related to corresponding parts of a video or audio-visual data signal (101), which data items (103, 203) describe sequences of characters as well as information about where in pictures of said data signal and/or when and/or how to make the referenced characters visible using a display memory (207); means (206) for: deriving from said items of Character Referencing Data (103, 203) items of Character Selecting Information and Character Positioning Information; reading pixel data of said referenced characters as designated by said items of Character Selection Information from a font memory (208); writing said pixel data into said display memory (207) as designated by said items of Character Positioning Information.
5. Apparatus according to claim 4, wherein said means (202) for retrieving, before retrieving said items of Character Referencing Data (103, 203), retrieve items of Font Describing Data (102, 204) related to corresponding ones of said items of Character Referencing Data (103, 203), said apparatus further including: means (205) for writing said items of Font Describing Data into said font memory (208).
6. Apparatus according to claim 4 or 5, further including means for checking, after retrieving said items of Character Referencing Data (103, 203), whether or not said pixel data of said referenced characters are already stored in said font memory (208), wherein, if not true, such items of Font Describing Data (102, 204) are retrieved that contain said referenced characters, and are written into said font memory (208).
7. Method for encoding subtitling data, characterised by the step: attaching to a video or audio-visual data signal (101) related subtitling data including items of Character Referencing Data (103, 203) and items of Font Describing Data (102, 204), whereby said items of Character Referencing Data (103, 203) describe sequences of characters as well as information about where in pictures of said data signal and/or when and/or how to make the referenced characters visible using a display memory, said items of Character Referencing Data including items of Character Selecting Information and Character Positioning Information, wherein said items of Character Selection Information can be used in a subtitle decoder for reading pixel data of said referenced characters from a font memory and said items of Character Positioning Information can be used in said subtitle decoder for writing said pixel data into said display memory, and whereby said items of Font Describing Data (102, 204) can be written in said subtitle decoder into said font memory for checking whether or not said pixel data of said referenced characters are already stored in said font memory and, if not true, retrieving such items of Font Describing Data (102, 204) which contain said referenced characters and writing said items of Font Describing Data into said font memory.
8. A data carrier containing a video or audio-visual data signal (101) and related subtitling data that are encoded using a method according to claim 7.